Convolutive Prediction for Monaural Speech Dereverberation and Noisy-Reverberant Speaker Separation

Authors

Abstract

A promising approach for speech dereverberation is based on supervised learning, where a deep neural network (DNN) is trained to predict the direct sound from noisy-reverberant speech. This data-driven approach is based on leveraging prior knowledge of clean speech patterns, and seldom explicitly exploits the linear-filter structure in reverberation, i.e., that reverberation results from a linear convolution between a room impulse response (RIR) and a dry source signal. In this work, we propose to exploit this linear-filter structure within a deep learning based monaural speech dereverberation framework. The key idea is to first estimate the direct-path signal of the target speaker using a DNN and then identify signals that are decayed and delayed copies of the estimated direct-path signal, as these can be reliably considered as reverberation. They can be either directly removed for dereverberation, or used as extra features for another DNN to perform better dereverberation. To identify the copies, we estimate the underlying filter (or RIR) by efficiently solving a linear regression problem per frequency in the time-frequency domain. We then modify the proposed algorithm for speaker separation in reverberant and noisy-reverberant conditions. State-of-the-art speech dereverberation and speaker separation results are obtained on the REVERB, SMS-WSJ, and WHAMR! datasets.
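The per-frequency regression step described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: it assumes the mixture STFT `Y` and the DNN's direct-path estimate `S_hat` are already available as complex arrays of shape (frames, frequencies), uses plain unweighted least squares, and treats the filter as `num_taps` coefficients applied to frames delayed by at least `delay` frames. The function names and parameters are hypothetical.

```python
import numpy as np

def delayed_copies(s_hat_f, num_taps, delay=1):
    """Stack delayed copies of one frequency's frames into a (T, num_taps) matrix."""
    T = s_hat_f.shape[0]
    A = np.zeros((T, num_taps), dtype=complex)
    for k in range(num_taps):
        shift = delay + k
        if shift < T:
            # Column k holds the direct-path estimate delayed by `shift` frames.
            A[shift:, k] = s_hat_f[:T - shift]
    return A

def convolutive_prediction(Y, S_hat, num_taps=4, delay=1):
    """Estimate a per-frequency filter and remove the predicted reverberation.

    Assumption (not from the source): the reverberation in Y is modeled as a
    linear combination of delayed copies of S_hat, fitted by ordinary least
    squares independently for each frequency.
    """
    T, F = Y.shape
    G = np.zeros((num_taps, F), dtype=complex)
    reverb = np.zeros((T, F), dtype=complex)
    for f in range(F):
        A = delayed_copies(S_hat[:, f], num_taps, delay)
        # Fit the part of the mixture not explained by the direct-path estimate.
        g, *_ = np.linalg.lstsq(A, Y[:, f] - S_hat[:, f], rcond=None)
        G[:, f] = g
        reverb[:, f] = A @ g
    # Subtract the decayed, delayed copies identified as reverberation.
    return Y - reverb, G
```

In this sketch the subtraction corresponds to the "directly removed" option in the abstract; the array `reverb` could instead be passed as an extra input feature to a second network, which is the other option the abstract mentions.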

Similar resources

Multi-step linear prediction based speech dereverberation in noisy reverberant environment

A speech signal captured by a distant microphone is generally contaminated by reverberation and background noise, which severely degrade the automatic speech recognition (ASR) performance. In this paper, we first extend a previously proposed single channel dereverberation algorithm to a multi-channel scenario. The method estimates late reflections using multichannel multi-step linear prediction...

Separation and dereverberation performance of frequency domain blind source separation for speech in a reverberant environment

In this paper, we investigate the performance of an unmixing system obtained by frequency domain Blind Source Separation (BSS) based on Independent Component Analysis (ICA). Since ICA is based on statistics, i.e., it only attempts to make outputs independent, it is not easy to predict what is going on in a BSS system. We therefore investigate the detailed components in the processed signals of ...

Linear prediction modulation filtering for speaker recognition of reverberant speech

This paper proposes a framework for spectral enhancement of reverberant speech based on inversion of the modulation transfer function. All-pole modeling of the modulation spectra of clean and degraded speech is utilized to derive the linear prediction inverse modulation transfer function (LP-IMTF) solution as a low-order IIR filter in the modulation envelope domain. By considering spectral estimat...

Ideal Ratio Mask Estimation Using Deep Neural Networks for Monaural Speech Segregation in Noisy Reverberant Conditions

Monaural speech segregation is an important problem in robust speech processing and has been formulated as a supervised learning problem. In supervised learning methods, the ideal binary mask (IBM) is usually used as the target because of its simplicity and large speech intelligibility gains. Recently, the ideal ratio mask (IRM) has been found to improve the speech quality over the IBM. However...

Multipitch Tracking for Noisy and Reverberant Speech

Multipitch tracking in real environments is critical for speech signal processing. Determining pitch in reverberant and noisy speech is a particularly challenging task. In this paper, we propose a robust algorithm for multipitch tracking in the presence of both background noise and room reverberation. An auditory front-end and a new channel selection method are utilized to extract pe...

Journal

Journal title: IEEE/ACM Transactions on Audio, Speech, and Language Processing

Year: 2021

ISSN: 2329-9304, 2329-9290

DOI: https://doi.org/10.1109/taslp.2021.3129363